[DOP-18631] - add partial support for ArrayType #8
Conversation
Codecov Report

Attention: Patch coverage is …

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           develop        #8      +/-   ##
============================================
- Coverage   100.00%    71.26%   -28.74%
============================================
  Files            1         1
  Lines           22        87       +65
  Branches         4        27       +23
============================================
+ Hits            22        62       +40
- Misses           0        25       +25
```

☔ View full report in Codecov by Sentry.
But …

no, it first converts in Scala for arrays: https://github.com/apache/spark/blob/c3bb64e367aca5b49c691a9b64bc3e2965b7c80e/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L712

in Scala*

It is important to have a full stacktrace.
Force-pushed from 1a55bb6 to 35f8e8e
Force-pushed from 35f8e8e to 3e3fc6c
The new type mappings should be mentioned in the documentation.
Change Summary

Add partial support for mapping ClickHouse `Array(T)` to Spark `ArrayType(T)`:

| ClickHouse type (read) | Spark type | ClickHouse type (write) |
|---|---|---|
| Array(String) | ArrayType(StringType) | Array(String) |
| Array(Int8) | ArrayType(ByteType) | Array(Int8) |
| Array(Int64) | ArrayType(LongType) | Array(Int64) |
| Array(Decimal(M, N)) | ArrayType(DecimalType(M, N)) | Array(Decimal(M, N)) |
| Array(Date) | ArrayType(DateType) | Array(Date) |
| Array(Float32) | ArrayType(FloatType) | Array(Float32) |

`ArrayType(ShortType)`, `ArrayType(TimestampType)`, and `ArrayType(DoubleType)` are listed without a ClickHouse counterpart (see the ShortType/DoubleType problem below).
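For illustration only, the supported rows of the table above can be sketched as a small pure-Scala mapping. Everything here (`ArrayTypeMapping`, the string-based type names) is hypothetical and not the PR's actual `ClickhouseDialectExtension` code, which works on real JDBC/Catalyst types:

```scala
// Hypothetical sketch: map a ClickHouse Array(T) type name to the name of
// the corresponding Spark catalyst type, mirroring the partial-support table.
object ArrayTypeMapping {
  private val arrayPattern   = """^Array\((.+)\)$""".r
  private val decimalPattern = """^Decimal\((\d+),\s*(\d+)\)$""".r

  def toCatalystName(clickhouseType: String): Option[String] =
    clickhouseType match {
      case arrayPattern(element) => elementName(element).map(n => s"ArrayType($n)")
      case _                     => None
    }

  private def elementName(t: String): Option[String] = t match {
    case "String"             => Some("StringType")
    case "Int8"               => Some("ByteType")
    case "Int64"              => Some("LongType")
    case "Date"               => Some("DateType")
    case "Float32"            => Some("FloatType")
    case decimalPattern(p, s) => Some(s"DecimalType($p, $s)")
    case _                    => None // Int16 etc. are outside the partial support
  }
}
```

For example, `ArrayTypeMapping.toCatalystName("Array(Int8)")` yields `Some("ArrayType(ByteType)")`, while unsupported element types yield `None`.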
The problem with ShortType and DoubleType is the following: in native Spark, the type name for the element type nested inside an Array is obtained via `getJDBCType`. The default `getJDBCType` contains `case ShortType => Option(JdbcType("INTEGER", java.sql.Types.SMALLINT))`, so it converts `ShortType` to the JDBC type name `"INTEGER"`, and `ArrayType(ShortType)` -> `Array(INTEGER)`. But since we customize the behaviour for `ShortType` to map to `Int16` in ClickHouse, Spark then tries to convert `ArrayType(ShortType)` -> `Array(Int16)`; however, there is no `Int16` type on the Java/JDBC side, so it fails with the error `java.lang.IllegalArgumentException: Unknown data type: int16`.
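To make the failure mode concrete, here is a self-contained, simplified model of the write path described above. The names (`clickhouseJdbcName`, `knownElementTypes`, the local `createArrayOf`) are hypothetical stand-ins; in reality the rejection happens inside the JDBC driver when `java.sql.Connection.createArrayOf` receives an element type name it does not know:

```scala
// Hypothetical model: two competing mappings from a Spark type to a
// JDBC element type name, and a stand-in for the driver's type check.
sealed trait CatalystType
case object MyShortType   extends CatalystType
case object MyIntegerType extends CatalystType

// Spark's default behaviour: ShortType is written out as "INTEGER"
def defaultJdbcName(t: CatalystType): String = t match {
  case MyShortType | MyIntegerType => "INTEGER"
}

// the customized ClickHouse behaviour from this PR: ShortType -> Int16
def clickhouseJdbcName(t: CatalystType): String = t match {
  case MyShortType   => "Int16"
  case MyIntegerType => "Int32"
}

// stand-in for the element type names a driver's createArrayOf accepts
val knownElementTypes = Set("integer", "int32", "string")

def createArrayOf(elementTypeName: String): Either[String, String] = {
  val n = elementTypeName.toLowerCase
  if (knownElementTypes.contains(n)) Right(s"Array($elementTypeName)")
  else Left(s"Unknown data type: $n") // mirrors the IllegalArgumentException
}
```

With the default mapping the array write succeeds (`"INTEGER"` is known), while the customized `"Int16"` name is rejected, which is exactly the trade-off the PR discussion describes.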
So for the current Spark implementation there is a choice: …

Related issue number
Checklist
Auto-commit applied. A retry was triggered due to a failure in formatting/linting checks.